CSE 255 Assignment 9
Authors
Abstract
In this paper we train logistic regression models for two link prediction tasks on a social network of 244 suspected terrorists. We train and test on a dataset created at the University of Maryland and further modified at UCSD by Eric Doi and Ke Tang [2]. Each link between two suspected terrorists carries a label describing its nature: colleague, family, contact, or congregate. We combine structural information about the known connectivity of the network with additional binary attributes of the individuals to arrive at two final models. The first model predicts whether a link of any type exists between two individuals; the second classifies an existing link as 'colleague' or 'other'. On the link prediction task, our final logistic regression, with per-example cost of 117, achieves an average AUC of 0.93. On the link classification task, the final linear logistic regression, with per-example regularization of 33.7, achieves a 0/1 accuracy of 0.92.
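The abstract does not describe the feature construction or the training code, so the following is only a minimal sketch of the kind of pipeline it summarizes: an L2-regularized logistic regression scored by AUC and 0/1 accuracy. Everything here is an assumption made for illustration: the data are synthetic (two made-up features per pair of individuals, standing in for a structural score and an attribute-overlap score), the function names are hypothetical, and the regularization strength `l2=0.1` is arbitrary and unrelated to the values 117 and 33.7 reported above.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logreg(X, y, l2=0.1, lr=0.1, epochs=200):
    """Fit L2-regularized logistic regression by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        # Mean log-loss gradient plus the L2 penalty gradient on the weights.
        w = [wj - lr * (gw[j] / n + l2 * wj) for j, wj in enumerate(w)]
        b -= lr * gb / n
    return w, b


def predict_proba(w, b, X):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 1/2)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_pairs(n, rng):
    # Synthetic stand-in for pairs of individuals: linked pairs (label 1)
    # tend to have higher values on both hypothetical features.
    X, y = [], []
    for _ in range(n):
        linked = 1 if rng.random() < 0.5 else 0
        mean = 1.0 if linked else -1.0
        X.append([rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)])
        y.append(linked)
    return X, y


def run_demo(seed=0):
    rng = random.Random(seed)
    X_train, y_train = make_pairs(400, rng)
    X_test, y_test = make_pairs(200, rng)
    w, b = train_logreg(X_train, y_train)
    p = predict_proba(w, b, X_test)
    # 0/1 accuracy: threshold the predicted probability at 0.5.
    acc = sum((pi >= 0.5) == (yi == 1) for pi, yi in zip(p, y_test)) / len(y_test)
    return auc(y_test, p), acc
```

Calling `run_demo()` returns an `(auc, accuracy)` pair for the synthetic test split. AUC suits the first task because it measures ranking quality independently of any decision threshold, whereas 0/1 accuracy, as used for the second task, depends on the 0.5 cutoff.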